Flagging Inland Data - Explore Grossrange

Summary

CMAR has collected data from several inland freshwater bodies in Nova Scotia, including lakes and rivers.

CMAR intends to process and publish all inland data under a new “Inland” branch of the Coastal Monitoring Program. Data will be processed in a similar manner to the coastal water quality data, and data flags will be applied using the qaqcmar package.

Sensors on some rivers are suspected to have been out of the water for part of their deployment because of low water levels. Temperatures recorded while a sensor is exposed to air fluctuate more quickly than those recorded while it is submerged. Data from periods of suspected exposure will be flagged.
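One way to make the "fluctuates more quickly" observation concrete is to compute the rate of change between consecutive readings. The sketch below is illustrative only: the function name, the sample readings, and the 2 °C per hour cutoff are all assumptions, not values from this analysis and not part of qaqcmar.

```python
from datetime import datetime, timedelta

def flag_rapid_change(times, temps, max_rate_per_hour):
    """Return one boolean per reading: True where the absolute temperature
    change from the previous reading exceeds max_rate_per_hour (deg C / hour).
    Assumes timestamps are strictly increasing. The first reading has no
    predecessor and is never flagged."""
    flags = [False]
    for i in range(1, len(temps)):
        hours = (times[i] - times[i - 1]).total_seconds() / 3600
        rate = abs(temps[i] - temps[i - 1]) / hours
        flags.append(rate > max_rate_per_hour)
    return flags

# Hypothetical hourly readings: steady at first, then swinging quickly
start = datetime(2023, 7, 1, 0, 0)
times = [start + timedelta(hours=h) for h in range(6)]
temps = [18.0, 18.2, 18.1, 23.5, 17.9, 18.0]

# Hypothetical cutoff of 2 deg C per hour
flags = flag_rapid_change(times, temps, max_rate_per_hour=2.0)
```

Here the jump to 23.5 and the drop back to 17.9 are both marked, while the slow drift around 18 is not. Any real cutoff would need to be chosen from the submerged data itself.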

The purpose of this document is to help CMAR determine appropriate data flagging tests and thresholds for freshwater (inland) data. We do not yet have enough freshwater data to support an analysis as thorough as the one used to develop tests and thresholds for the coastal water quality data, so some thresholds may be chosen more subjectively. Note that this initial threshold analysis was completed on a subset of the data.

Waterbodies included in threshold analysis:

  • Gold River
  • LaHave River
  • Musquodoboit River
  • Roseway River
  • Round Hill River
  • Salmon River
  • Tusket River
  • Liscomb River
  • Mersey River

Stations included in threshold analysis:

  • Gold River 2
  • LaHave River 1
  • LaHave River 3
  • Musquodoboit River 1
  • Musquodoboit River 2
  • Musquodoboit River 3
  • Roseway River 1
  • Roseway River 2
  • Round Hill River 1
  • Round Hill River 2
  • Round Hill River 3
  • Salmon River 1
  • Salmon River 2
  • Tusket River 1
  • Tusket River 2
  • LaHave River 2
  • Liscomb River 1
  • Liscomb River 2
  • Mersey River 2
  • Tusket River 3

Stations that may have experienced air exposure:

  • Liscomb 1
  • Liscomb 2
  • LaHave 2
  • Mersey 2
  • Tusket 3
  • Possibly Musquodoboit 1 and 2

Data visualization

Station locations

Approximate location of stations included in thresholds analysis.

Plot uncleaned station data

Plot cleaned station data

Suspected outliers have been removed from the following datasets:

  • Liscomb 1
  • Liscomb 2
  • LaHave 2
  • Mersey 2
  • Tusket 3

The cleaned datasets will be used to generate the station statistics and grossrange thresholds.

Statistical overview

Distribution of observations

Summary stats

By station

Pooled

Mean and standard deviation

Quantiles

Calculate user grossrange threshold

Since the datasets are all approximately normally distributed, the mean ± 3 standard deviations (SD) will be used to develop the grossrange thresholds. Stations have similar means and standard deviations, so the data were pooled to determine one set of user grossrange thresholds for flagging all inland datasets.
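The pooled calculation can be sketched as follows. This is a minimal Python illustration, not the code used for this analysis: the function name and the sample readings are hypothetical, and it assumes symmetric bounds (mean minus and plus 3 SD) with the sample standard deviation.

```python
from statistics import mean, stdev

def grossrange_thresholds(values, n_sd=3):
    """Return (user_min, user_max) as the mean minus / plus n_sd sample
    standard deviations of the supplied values."""
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (n - 1 denominator)
    return mu - n_sd * sd, mu + n_sd * sd

# Hypothetical temperature readings (deg C) from two stations
station_a = [14.0, 16.0, 18.0, 20.0]
station_b = [15.0, 17.0, 19.0, 21.0]

# Pool the observations before computing a single set of thresholds
pooled = station_a + station_b
user_min, user_max = grossrange_thresholds(pooled)
```

Pooling the observations, rather than averaging per-station statistics, weights each station by its number of readings, so long deployments influence the thresholds more than short ones.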

Note that the statistically derived thresholds may be misleading when applied to future datasets, because most of the training datasets do not cover a full year and very few include data from winter months. For now, it is therefore recommended that the CMAR user grossrange thresholds be applied only to data collected between June and October (inclusive).

Mean and SD threshold table

Apply user grossrange threshold
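
As a rough illustration of applying the thresholds with the June to October restriction recommended above, the sketch below assigns a flag to a single observation. The function name, the flag labels, and the threshold values of 5 and 28 °C are assumptions made for the example. They are not the thresholds derived in this analysis, and the code does not call qaqcmar.

```python
from datetime import datetime

def flag_user_grossrange(timestamp, value, user_min, user_max,
                         months=range(6, 11)):
    """Flag one observation against user grossrange thresholds.
    Observations outside the allowed months (June to October by default)
    are not tested at all."""
    if timestamp.month not in months:
        return "Not Evaluated"
    if value < user_min or value > user_max:
        return "Suspect/Of Interest"
    return "Pass"

# Hypothetical thresholds (deg C), chosen only for illustration
USER_MIN, USER_MAX = 5.0, 28.0

summer_high = flag_user_grossrange(datetime(2023, 7, 15), 30.1, USER_MIN, USER_MAX)
summer_ok = flag_user_grossrange(datetime(2023, 7, 15), 18.0, USER_MIN, USER_MAX)
winter = flag_user_grossrange(datetime(2023, 1, 15), 0.5, USER_MIN, USER_MAX)
```

Returning a separate label for untested months keeps winter observations from being read as having passed the test.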

Visualize flagged data - threshold subset

Flag datasets included in threshold analysis.

Visualize flagged data - all datasets

Flag all inland datasets